Overview

Dataset statistics

Number of variables: 28
Number of observations: 117405
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1096
Duplicate rows (%): 0.9%
Total size in memory: 20.4 MiB
Average record size in memory: 182.0 B
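
As a rough consistency check, the duplicate percentage and the total memory size can be re-derived from the row count. The sketch below is not part of the original report; it only recomputes the rounded figures from the numbers listed above.

```python
# Figures copied from the dataset statistics above.
n_rows = 117405
n_duplicates = 1096
avg_record_bytes = 182.0

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = n_duplicates / n_rows * 100

# Total size implied by the average record size (1 MiB = 2**20 bytes).
total_mib = avg_record_bytes * n_rows / 2**20

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Both values agree with the report after rounding to one decimal place.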

Variable types

Numeric: 11
Categorical: 11
Boolean: 6

Warnings

Reason has constant value "TB" [Constant]
Is_year_start has constant value "False" [Constant]
Dataset has 1096 (0.9%) duplicate rows [Duplicates]
Id has a high cardinality: 22217 distinct values [High cardinality]
Applied is highly correlated with Received and 2 other fields [High correlation]
Received is highly correlated with Applied and 2 other fields [High correlation]
logapplied is highly correlated with Applied and 2 other fields [High correlation]
logreceived is highly correlated with Applied and 2 other fields [High correlation]
Year is highly correlated with Elapsed [High correlation]
Month is highly correlated with Week and 1 other field [High correlation]
Week is highly correlated with Month and 1 other field [High correlation]
Dayofyear is highly correlated with Month and 1 other field [High correlation]
Elapsed is highly correlated with Year [High correlation]
Gender is highly correlated with Is_year_start and 1 other field [High correlation]
Is_month_end is highly correlated with Is_year_start and 1 other field [High correlation]
True_False is highly correlated with Is_year_start and 1 other field [High correlation]
Payment_Method is highly correlated with Is_year_start and 2 other fields [High correlation]
Location is highly correlated with Is_year_start and 1 other field [High correlation]
Area is highly correlated with Is_year_start and 1 other field [High correlation]
AgeGroup is highly correlated with Is_year_start and 2 other fields [High correlation]
Is_month_start is highly correlated with Is_year_start and 1 other field [High correlation]
Is_quarter_start is highly correlated with Is_year_start and 1 other field [High correlation]
Is_year_end is highly correlated with Is_year_start and 1 other field [High correlation]
Is_year_start is highly correlated with Gender and 14 other fields [High correlation]
Year is highly correlated with Is_year_start and 1 other field [High correlation]
Is_quarter_end is highly correlated with Is_year_start and 1 other field [High correlation]
Age is highly correlated with AgeGroup and 2 other fields [High correlation]
Payment_Type is highly correlated with Payment_Method and 2 other fields [High correlation]
Reason is highly correlated with Gender and 14 other fields [High correlation]
Ratio is highly skewed (γ1 = -135.0970658) [Skewed]
Dayofweek has 21122 (18.0%) zeros [Zeros]
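
The kinds of checks behind the Constant, High cardinality and Zeros warnings can be sketched in a few lines. This is an illustration only, not the profiling tool's implementation: the function name `column_warnings` and the cardinality threshold of 50 are hypothetical choices made for this sketch.

```python
from collections import Counter

def column_warnings(values, cardinality_threshold=50):
    """Return illustrative warning labels for one column.

    The threshold is a hypothetical default chosen for this sketch,
    not a value taken from the report's configuration.
    """
    counts = Counter(values)
    if len(counts) == 1:
        return ["Constant"]
    labels = []
    # Text columns with many distinct values are flagged as high cardinality.
    if all(isinstance(v, str) for v in counts) and len(counts) > cardinality_threshold:
        labels.append("High cardinality")
    # Numeric columns report how many entries are exactly zero.
    zeros = sum(1 for v in values if not isinstance(v, (str, bool)) and v == 0)
    if zeros:
        labels.append(f"Zeros ({zeros / len(values):.1%})")
    return labels

print(column_warnings(["TB"] * 5))
print(column_warnings([f"GHI{i:09d}" for i in range(60)]))
print(column_warnings([0, 1, 2, 3, 4, 0, 2, 1, 3, 4]))
```

A constant column such as Reason carries no information for modelling, which is presumably why the report marks it as rejected.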

Reproduction

Analysis started: 2021-04-26 18:49:36.925979
Analysis finished: 2021-04-26 18:51:49.787225
Duration: 2 minutes and 12.86 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

Applied
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2043
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 920.5182914
Minimum: 1
Maximum: 21060
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 134
Q1: 360
median: 800
Q3: 1350
95-th percentile: 2120
Maximum: 21060
Range: 21059
Interquartile range (IQR): 990

Descriptive statistics

Standard deviation: 645.3930638
Coefficient of variation (CV): 0.7011192171
Kurtosis: 8.808821341
Mean: 920.5182914
Median Absolute Deviation (MAD): 480
Skewness: 1.081168702
Sum: 108073450
Variance: 416532.2068
Monotonicity: Not monotonic
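
Several of the Applied statistics are tied together by simple identities (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared). The sketch below re-derives them from the reported figures as a sanity check; it assumes nothing beyond the numbers printed above.

```python
# Figures copied from the Applied statistics above.
n = 117405
total = 108073450
std = 645.3930638
q1, q3 = 360, 1350
minimum, maximum = 1, 21060

mean = total / n                 # Mean = Sum / number of observations
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation
iqr = q3 - q1                    # Interquartile range
value_range = maximum - minimum  # Range

print(mean, cv, variance, iqr, value_range)
```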
Histogram with fixed size bins (bins=50)

Common values
Value                  Count    Frequency (%)
1000                   2785     2.4%
1200                   2669     2.3%
600                    2612     2.2%
210                    2256     1.9%
1400                   2181     1.9%
800                    1821     1.6%
500                    1780     1.5%
1800                   1715     1.5%
900                    1547     1.3%
400                    1540     1.3%
Other values (2033)    96499    82.2%

Minimum 5 values
Value                  Count    Frequency (%)
1                      1        < 0.1%
2                      4        < 0.1%
3                      4        < 0.1%
4                      1        < 0.1%
5                      2        < 0.1%

Maximum 5 values
Value                  Count    Frequency (%)
21060                  1        < 0.1%
11220                  1        < 0.1%
6140                   1        < 0.1%
6000                   1        < 0.1%
4648                   1        < 0.1%

Gender
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
F
75095 
M
42279 
GD
 
31

Length

Max length2
Median length1
Mean length1.000264043
Min length1

Characters and Unicode

Total characters117436
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowF
2nd rowF
3rd rowM
4th rowM
5th rowF
ValueCountFrequency (%)
F75095
64.0%
M42279
36.0%
GD31
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
f75095
64.0%
m42279
36.0%
gd31
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
F75095
63.9%
M42279
36.0%
G31
 
< 0.1%
D31
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter117436
100.0%

Most frequent character per category

ValueCountFrequency (%)
F75095
63.9%
M42279
36.0%
G31
 
< 0.1%
D31
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin117436
100.0%

Most frequent character per script

ValueCountFrequency (%)
F75095
63.9%
M42279
36.0%
G31
 
< 0.1%
D31
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII117436
100.0%

Most frequent character per block

ValueCountFrequency (%)
F75095
63.9%
M42279
36.0%
G31
 
< 0.1%
D31
 
< 0.1%
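
The Gender tables show F at 64.0% in the value counts but 63.9% in the character counts. The difference comes from the denominator: the 31 "GD" rows contribute two characters each. A short sketch using only the counts above:

```python
# Counts copied from the Gender tables above.
value_counts = {"F": 75095, "M": 42279, "GD": 31}

n_rows = sum(value_counts.values())
# Every character of every value counts once, so "GD" contributes two.
n_chars = sum(len(value) * count for value, count in value_counts.items())

share_of_rows = value_counts["F"] / n_rows * 100
share_of_chars = value_counts["F"] / n_chars * 100
mean_length = n_chars / n_rows

print(n_rows, n_chars)
print(f"{share_of_rows:.1f}% of rows, {share_of_chars:.1f}% of characters")
print(f"mean length {mean_length:.9f}")
```

The same calculation reproduces the reported total of 117436 characters and the mean length of about 1.000264.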

Payment_Method
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
AV
100828 
RP
16576 
U
 
1

Length

Max length2
Median length2
Mean length1.999991482
Min length1

Characters and Unicode

Total characters234809
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowAV
2nd rowRP
3rd rowAV
4th rowAV
5th rowAV
ValueCountFrequency (%)
AV100828
85.9%
RP16576
 
14.1%
U1
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
av100828
85.9%
rp16576
 
14.1%
u1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
A100828
42.9%
V100828
42.9%
R16576
 
7.1%
P16576
 
7.1%
U1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter234809
100.0%

Most frequent character per category

ValueCountFrequency (%)
A100828
42.9%
V100828
42.9%
R16576
 
7.1%
P16576
 
7.1%
U1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin234809
100.0%

Most frequent character per script

ValueCountFrequency (%)
A100828
42.9%
V100828
42.9%
R16576
 
7.1%
P16576
 
7.1%
U1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII234809
100.0%

Most frequent character per block

ValueCountFrequency (%)
A100828
42.9%
V100828
42.9%
R16576
 
7.1%
P16576
 
7.1%
U1
 
< 0.1%

Location
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
M
52406 
NE
36752 
O
12506 
PP
11494 
U
 
4247

Length

Max length2
Median length1
Mean length1.410936502
Min length1

Characters and Unicode

Total characters165651
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowM
2nd rowNE
3rd rowM
4th rowM
5th rowM
ValueCountFrequency (%)
M52406
44.6%
NE36752
31.3%
O12506
 
10.7%
PP11494
 
9.8%
U4247
 
3.6%
Histogram of lengths of the category
ValueCountFrequency (%)
m52406
44.6%
ne36752
31.3%
o12506
 
10.7%
pp11494
 
9.8%
u4247
 
3.6%

Most occurring characters

ValueCountFrequency (%)
M52406
31.6%
N36752
22.2%
E36752
22.2%
P22988
13.9%
O12506
 
7.5%
U4247
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter165651
100.0%

Most frequent character per category

ValueCountFrequency (%)
M52406
31.6%
N36752
22.2%
E36752
22.2%
P22988
13.9%
O12506
 
7.5%
U4247
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin165651
100.0%

Most frequent character per script

ValueCountFrequency (%)
M52406
31.6%
N36752
22.2%
E36752
22.2%
P22988
13.9%
O12506
 
7.5%
U4247
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII165651
100.0%

Most frequent character per block

ValueCountFrequency (%)
M52406
31.6%
N36752
22.2%
E36752
22.2%
P22988
13.9%
O12506
 
7.5%
U4247
 
2.6%

Received
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3645
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 920.5135505
Minimum: 1
Maximum: 21060
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 134
Q1: 360
median: 800
Q3: 1350
95-th percentile: 2120
Maximum: 21060
Range: 21059
Interquartile range (IQR): 990

Descriptive statistics

Standard deviation: 645.3905099
Coefficient of variation (CV): 0.7011200536
Kurtosis: 8.808916907
Mean: 920.5135505
Median Absolute Deviation (MAD): 480
Skewness: 1.081163686
Sum: 108072893.4
Variance: 416528.9102
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                  Count    Frequency (%)
1000                   2785     2.4%
1200                   2669     2.3%
600                    2612     2.2%
210                    2255     1.9%
1400                   2181     1.9%
800                    1819     1.5%
500                    1775     1.5%
1800                   1715     1.5%
900                    1547     1.3%
400                    1540     1.3%
Other values (3635)    96507    82.2%

Minimum 5 values
Value                  Count    Frequency (%)
1                      1        < 0.1%
2                      3        < 0.1%
2.1                    1        < 0.1%
2.5                    2        < 0.1%
2.58                   1        < 0.1%

Maximum 5 values
Value                  Count    Frequency (%)
21060                  1        < 0.1%
11220                  1        < 0.1%
6140                   1        < 0.1%
6000                   1        < 0.1%
4647.9                 1        < 0.1%

Id
Categorical

HIGH CARDINALITY

Distinct22217
Distinct (%)18.9%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
GHI000112669
11022 
GHI000753413
 
1149
GHI001206283
 
986
GHI000143648
 
768
GHI000134418
 
576
Other values (22212)
102904 

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters1408860
Distinct characters13
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14612 ?
Unique (%)12.4%
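
The figures above are consistent with "Distinct" counting the different values present and "Unique" counting the values that occur exactly once. A minimal sketch of the two counts, using a made-up list of identifiers (the sample data is hypothetical, not taken from the dataset):

```python
from collections import Counter

# Hypothetical sample of identifiers, made up for this sketch.
ids = ["GHI0001", "GHI0001", "GHI0002", "GHI0003", "GHI0003", "GHI0004"]

counts = Counter(ids)
n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(f"Distinct: {n_distinct} ({n_distinct / len(ids):.1%})")
print(f"Unique: {n_unique} ({n_unique / len(ids):.1%})")
```

Under this reading, 22217 different Id values appear in the data and 14612 of them appear on a single row.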

Sample

1st rowGHI000112669
2nd rowGHI000780038
3rd rowGHI000437510
4th rowGHI000140582
5th rowGHI001420547
Value                   Count     Frequency (%)
GHI000112669            11022     9.4%
GHI000753413            1149      1.0%
GHI001206283            986       0.8%
GHI000143648            768       0.7%
GHI000134418            576       0.5%
GHI000437510            552       0.5%
GHI001853440            549       0.5%
GHI001086470            541       0.5%
GHI000086558            538       0.5%
GHI000100619            526       0.4%
Other values (22207)    100198    85.3%
Histogram of lengths of the category
ValueCountFrequency (%)
ghi00011266911022
 
9.4%
ghi0007534131149
 
1.0%
ghi001206283986
 
0.8%
ghi000143648768
 
0.7%
ghi000134418576
 
0.5%
ghi000437510552
 
0.5%
ghi001853440549
 
0.5%
ghi001086470541
 
0.5%
ghi000086558538
 
0.5%
ghi000100619526
 
0.4%
Other values (22207)100198
85.3%

Most occurring characters

ValueCountFrequency (%)
0392591
27.9%
1130709
 
9.3%
G117405
 
8.3%
H117405
 
8.3%
I117405
 
8.3%
681309
 
5.8%
270778
 
5.0%
968849
 
4.9%
467571
 
4.8%
863924
 
4.5%
Other values (3)180914
12.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1056645
75.0%
Uppercase Letter352215
 
25.0%

Most frequent character per category

ValueCountFrequency (%)
0392591
37.2%
1130709
 
12.4%
681309
 
7.7%
270778
 
6.7%
968849
 
6.5%
467571
 
6.4%
863924
 
6.0%
361966
 
5.9%
561074
 
5.8%
757874
 
5.5%
ValueCountFrequency (%)
G117405
33.3%
H117405
33.3%
I117405
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common1056645
75.0%
Latin352215
 
25.0%

Most frequent character per script

ValueCountFrequency (%)
0392591
37.2%
1130709
 
12.4%
681309
 
7.7%
270778
 
6.7%
968849
 
6.5%
467571
 
6.4%
863924
 
6.0%
361966
 
5.9%
561074
 
5.8%
757874
 
5.5%
ValueCountFrequency (%)
G117405
33.3%
H117405
33.3%
I117405
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1408860
100.0%

Most frequent character per block

ValueCountFrequency (%)
0392591
27.9%
1130709
 
9.3%
G117405
 
8.3%
H117405
 
8.3%
I117405
 
8.3%
681309
 
5.8%
270778
 
5.0%
968849
 
4.9%
467571
 
4.8%
863924
 
4.5%
Other values (3)180914
12.8%

Reason
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
TB
117405 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters234810
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTB
2nd rowTB
3rd rowTB
4th rowTB
5th rowTB
ValueCountFrequency (%)
TB117405
100.0%
Histogram of lengths of the category
ValueCountFrequency (%)
tb117405
100.0%

Most occurring characters

ValueCountFrequency (%)
T117405
50.0%
B117405
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter234810
100.0%

Most frequent character per category

ValueCountFrequency (%)
T117405
50.0%
B117405
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin234810
100.0%

Most frequent character per script

ValueCountFrequency (%)
T117405
50.0%
B117405
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII234810
100.0%

Most frequent character per block

ValueCountFrequency (%)
T117405
50.0%
B117405
50.0%

Age
Categorical

HIGH CORRELATION

Distinct13
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
25-29
21016 
20-24
18587 
30-34
16859 
35-39
12991 
40-44
10506 
Other values (8)
37446 

Length

Max length5
Median length5
Mean length4.885268941
Min length2

Characters and Unicode

Total characters573555
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row40-44
2nd row18-19
3rd row35-39
4th row55-59
5th row30-34
Value               Count    Frequency (%)
25-29               21016    17.9%
20-24               18587    15.8%
30-34               16859    14.4%
35-39               12991    11.1%
40-44               10506    8.9%
45-49               9398     8.0%
50-54               7349     6.3%
65+                 5973     5.1%
55-59               5690     4.8%
18-19               4414     3.8%
Other values (3)    4622     3.9%
Histogram of lengths of the category
ValueCountFrequency (%)
25-2921016
17.9%
20-2418587
15.8%
30-3416859
14.4%
35-3912991
11.1%
40-4410506
8.9%
45-499398
8.0%
50-547349
 
6.3%
655973
 
5.1%
55-595690
 
4.8%
18-194414
 
3.8%
Other values (3)4622
 
3.9%

Most occurring characters

ValueCountFrequency (%)
-110924
19.3%
497223
17.0%
581146
14.1%
279206
13.8%
359700
10.4%
057415
10.0%
953509
9.3%
614299
 
2.5%
19336
 
1.6%
+5973
 
1.0%
Other values (2)4824
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number456658
79.6%
Dash Punctuation110924
 
19.3%
Math Symbol5973
 
1.0%

Most frequent character per category

ValueCountFrequency (%)
497223
21.3%
581146
17.8%
279206
17.3%
359700
13.1%
057415
12.6%
953509
11.7%
614299
 
3.1%
19336
 
2.0%
84414
 
1.0%
7410
 
0.1%
ValueCountFrequency (%)
-110924
100.0%
ValueCountFrequency (%)
+5973
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common573555
100.0%

Most frequent character per script

ValueCountFrequency (%)
-110924
19.3%
497223
17.0%
581146
14.1%
279206
13.8%
359700
10.4%
057415
10.0%
953509
9.3%
614299
 
2.5%
19336
 
1.6%
+5973
 
1.0%
Other values (2)4824
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII573555
100.0%

Most frequent character per block

ValueCountFrequency (%)
-110924
19.3%
497223
17.0%
581146
14.1%
279206
13.8%
359700
10.4%
057415
10.0%
953509
9.3%
614299
 
2.5%
19336
 
1.6%
+5973
 
1.0%
Other values (2)4824
 
0.8%

Area
Categorical

HIGH CORRELATION

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
AM
32518 
O
26530 
C
14173 
BP
8627 
W
7766 
Other values (6)
27791 

Length

Max length3
Median length1
Mean length1.505353264
Min length1

Characters and Unicode

Total characters176736
Distinct characters14
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowO
2nd rowWlg
3rd rowAM
4th rowC
5th rowC
ValueCountFrequency (%)
AM32518
27.7%
O26530
22.6%
C14173
12.1%
BP8627
 
7.3%
W7766
 
6.6%
T6066
 
5.2%
S6012
 
5.1%
Wlg5238
 
4.5%
EC3958
 
3.4%
NL3752
 
3.2%
Histogram of lengths of the category
ValueCountFrequency (%)
am32518
27.7%
o26530
22.6%
c14173
12.1%
bp8627
 
7.3%
w7766
 
6.6%
t6066
 
5.2%
s6012
 
5.1%
wlg5238
 
4.5%
ec3958
 
3.4%
nl3752
 
3.2%

Most occurring characters

ValueCountFrequency (%)
A32518
18.4%
M32518
18.4%
O26530
15.0%
C18131
10.3%
W13004
 
7.4%
B8627
 
4.9%
P8627
 
4.9%
N6517
 
3.7%
T6066
 
3.4%
S6012
 
3.4%
Other values (4)18186
10.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter166260
94.1%
Lowercase Letter10476
 
5.9%

Most frequent character per category

ValueCountFrequency (%)
A32518
19.6%
M32518
19.6%
O26530
16.0%
C18131
10.9%
W13004
 
7.8%
B8627
 
5.2%
P8627
 
5.2%
N6517
 
3.9%
T6066
 
3.6%
S6012
 
3.6%
Other values (2)7710
 
4.6%
ValueCountFrequency (%)
l5238
50.0%
g5238
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin176736
100.0%

Most frequent character per script

ValueCountFrequency (%)
A32518
18.4%
M32518
18.4%
O26530
15.0%
C18131
10.3%
W13004
 
7.4%
B8627
 
4.9%
P8627
 
4.9%
N6517
 
3.7%
T6066
 
3.4%
S6012
 
3.4%
Other values (4)18186
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII176736
100.0%

Most frequent character per block

ValueCountFrequency (%)
A32518
18.4%
M32518
18.4%
O26530
15.0%
C18131
10.3%
W13004
 
7.4%
B8627
 
4.9%
P8627
 
4.9%
N6517
 
3.7%
T6066
 
3.4%
S6012
 
3.4%
Other values (4)18186
10.3%

True_False
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
0
117071 
1
 
334

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters117405
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0
ValueCountFrequency (%)
0117071
99.7%
1334
 
0.3%
Histogram of lengths of the category
ValueCountFrequency (%)
0117071
99.7%
1334
 
0.3%

Most occurring characters

ValueCountFrequency (%)
0117071
99.7%
1334
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number117405
100.0%

Most frequent character per category

ValueCountFrequency (%)
0117071
99.7%
1334
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Common117405
100.0%

Most frequent character per script

ValueCountFrequency (%)
0117071
99.7%
1334
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII117405
100.0%

Most frequent character per block

ValueCountFrequency (%)
0117071
99.7%
1334
 
0.3%

AgeGroup
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
MidAge
49754 
Adult
44017 
Old
23126 
Teenage
 
508

Length

Max length7
Median length5
Mean length5.038482177
Min length3

Characters and Unicode

Total characters591543
Distinct characters13
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMidAge
2nd rowAdult
3rd rowMidAge
4th rowOld
5th rowMidAge
ValueCountFrequency (%)
MidAge49754
42.4%
Adult44017
37.5%
Old23126
19.7%
Teenage508
 
0.4%
Histogram of lengths of the category
ValueCountFrequency (%)
midage49754
42.4%
adult44017
37.5%
old23126
19.7%
teenage508
 
0.4%

Most occurring characters

ValueCountFrequency (%)
d116897
19.8%
A93771
15.9%
l67143
11.4%
e51278
8.7%
g50262
8.5%
M49754
8.4%
i49754
8.4%
u44017
 
7.4%
t44017
 
7.4%
O23126
 
3.9%
Other values (3)1524
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter424384
71.7%
Uppercase Letter167159
 
28.3%

Most frequent character per category

ValueCountFrequency (%)
d116897
27.5%
l67143
15.8%
e51278
12.1%
g50262
11.8%
i49754
11.7%
u44017
 
10.4%
t44017
 
10.4%
n508
 
0.1%
a508
 
0.1%
ValueCountFrequency (%)
A93771
56.1%
M49754
29.8%
O23126
 
13.8%
T508
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin591543
100.0%

Most frequent character per script

ValueCountFrequency (%)
d116897
19.8%
A93771
15.9%
l67143
11.4%
e51278
8.7%
g50262
8.5%
M49754
8.4%
i49754
8.4%
u44017
 
7.4%
t44017
 
7.4%
O23126
 
3.9%
Other values (3)1524
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII591543
100.0%

Most frequent character per block

ValueCountFrequency (%)
d116897
19.8%
A93771
15.9%
l67143
11.4%
e51278
8.7%
g50262
8.5%
M49754
8.4%
i49754
8.4%
u44017
 
7.4%
t44017
 
7.4%
O23126
 
3.9%
Other values (3)1524
 
0.3%

Payment_Type
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
AV
100828 
RPU
16577 

Length

Max length3
Median length2
Mean length2.141195009
Min length2

Characters and Unicode

Total characters251387
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAV
2nd rowRPU
3rd rowAV
4th rowAV
5th rowAV
ValueCountFrequency (%)
AV100828
85.9%
RPU16577
 
14.1%
Histogram of lengths of the category
ValueCountFrequency (%)
av100828
85.9%
rpu16577
 
14.1%

Most occurring characters

ValueCountFrequency (%)
A100828
40.1%
V100828
40.1%
R16577
 
6.6%
P16577
 
6.6%
U16577
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter251387
100.0%

Most frequent character per category

ValueCountFrequency (%)
A100828
40.1%
V100828
40.1%
R16577
 
6.6%
P16577
 
6.6%
U16577
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Latin251387
100.0%

Most frequent character per script

ValueCountFrequency (%)
A100828
40.1%
V100828
40.1%
R16577
 
6.6%
P16577
 
6.6%
U16577
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII251387
100.0%

Most frequent character per block

ValueCountFrequency (%)
A100828
40.1%
V100828
40.1%
R16577
 
6.6%
P16577
 
6.6%
U16577
 
6.6%

logapplied
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2043
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.520006166
Minimum: 0
Maximum: 9.955130786
Zeros: 1
Zeros (%): < 0.1%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.8978398
Q1: 5.886104031
median: 6.684611728
Q3: 7.207859871
95-th percentile: 7.659171368
Maximum: 9.955130786
Range: 9.955130786
Interquartile range (IQR): 1.32175584

Descriptive statistics

Standard deviation: 0.8599657331
Coefficient of variation (CV): 0.1318964601
Kurtosis: -0.1932573719
Mean: 6.520006166
Median Absolute Deviation (MAD): 0.5947071077
Skewness: -0.6108255283
Sum: 765481.3239
Variance: 0.7395410621
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                  Count    Frequency (%)
6.907755279            2785     2.4%
7.090076836            2669     2.3%
6.396929655            2612     2.2%
5.347107531            2256     1.9%
7.244227516            2181     1.9%
6.684611728            1821     1.6%
6.214608098            1780     1.5%
7.495541944            1715     1.5%
6.802394763            1547     1.3%
5.991464547            1540     1.3%
Other values (2033)    96499    82.2%

Minimum 5 values
Value                  Count    Frequency (%)
0                      1        < 0.1%
0.6931471806           4        < 0.1%
1.098612289            4        < 0.1%
1.386294361            1        < 0.1%
1.609437912            2        < 0.1%

Maximum 5 values
Value                  Count    Frequency (%)
9.955130786            1        < 0.1%
9.325453179            1        < 0.1%
8.722580021            1        < 0.1%
8.699514748            1        < 0.1%
8.444192299            1        < 0.1%
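
The logapplied values line up with the natural logarithm of the corresponding Applied values, which would also explain the high correlation between the two columns. The report does not state how the column was built, so this is an inference from the numbers; the sketch below compares a few pairs taken from the tables in this report.

```python
import math

# (Applied value, logapplied value) pairs copied from the tables in this report.
pairs = [
    (1, 0.0),
    (210, 5.347107531),
    (1000, 6.907755279),
    (21060, 9.955130786),
]

for applied, reported in pairs:
    computed = math.log(applied)  # natural logarithm
    print(applied, round(computed, 9), reported)
```

The single zero in logapplied then corresponds to the one row where Applied equals 1.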

logreceived
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3645
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.519994709
Minimum: 0
Maximum: 9.955130786
Zeros: 1
Zeros (%): < 0.1%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.8978398
Q1: 5.886104031
median: 6.684611728
Q3: 7.207859871
95-th percentile: 7.659171368
Maximum: 9.955130786
Range: 9.955130786
Interquartile range (IQR): 1.32175584

Descriptive statistics

Standard deviation: 0.8599985062
Coefficient of variation (CV): 0.1319017184
Kurtosis: -0.1883908053
Mean: 6.519994709
Median Absolute Deviation (MAD): 0.5947071077
Skewness: -0.6113915001
Sum: 765479.9788
Variance: 0.7395974306
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                  Count    Frequency (%)
6.907755279            2785     2.4%
7.090076836            2669     2.3%
6.396929655            2612     2.2%
5.347107531            2255     1.9%
7.244227516            2181     1.9%
6.684611728            1819     1.5%
6.214608098            1775     1.5%
7.495541944            1715     1.5%
6.802394763            1547     1.3%
5.991464547            1540     1.3%
Other values (3635)    96507    82.2%

Minimum 5 values
Value                  Count    Frequency (%)
0                      1        < 0.1%
0.6931471806           3        < 0.1%
0.7419373447           1        < 0.1%
0.9162907319           2        < 0.1%
0.9477893989           1        < 0.1%

Maximum 5 values
Value                  Count    Frequency (%)
9.955130786            1        < 0.1%
9.325453179            1        < 0.1%
8.722580021            1        < 0.1%
8.699514748            1        < 0.1%
8.444170784            1        < 0.1%

Ratio
Real number (ℝ≥0)

SKEWED

Distinct: 2012
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9999890163
Minimum: 0.8333333333
Maximum: 1.06
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 0.8333333333
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 1
Maximum: 1.06
Range: 0.2266666667
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.0009291046131
Coefficient of variation (CV): 0.0009291148182
Kurtosis: 23431.76258
Mean: 0.9999890163
Median Absolute Deviation (MAD): 0
Skewness: -135.0970658
Sum: 117403.7105
Variance: 8.632353821 × 10⁻⁷
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                  Count     Frequency (%)
1                      113214    96.4%
1.000068695            93        0.1%
1.000149701            67        0.1%
1.000008885            63        0.1%
0.9999086758           42        < 0.1%
0.9998193642           35        < 0.1%
1.000292553            32        < 0.1%
1.000212443            31        < 0.1%
0.9997842968           30        < 0.1%
0.9997647059           29        < 0.1%
Other values (2002)    3769      3.2%

Minimum 5 values
Value                  Count     Frequency (%)
0.8333333333           2         < 0.1%
0.86                   1         < 0.1%
0.9                    1         < 0.1%
0.9627272727           1         < 0.1%
0.9725                 1         < 0.1%

Maximum 5 values
Value                  Count     Frequency (%)
1.06                   1         < 0.1%
1.05                   1         < 0.1%
1.03                   1         < 0.1%
1.02                   1         < 0.1%
1.015652174            1         < 0.1%
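
Ratio is flagged as skewed because almost every value is exactly 1 and a handful of low values pull the third moment strongly negative. The sketch below shows the effect on made-up data using the plain moment-based skewness, m3 / m2^1.5. The data is hypothetical, and the report's own estimator may apply a small-sample correction, so the numbers are not expected to match the report.

```python
def skewness(values):
    """Population (uncorrected) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data: 999 ratios of exactly 1 and a single low value.
ratios = [1.0] * 999 + [0.8]
print(skewness(ratios))
```

A single low outlier among a thousand identical values already gives a strongly negative result, which is the pattern the Skewed warning points at.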

Year
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size917.4 KiB
2018
29598 
2019
29329 
2017
29090 
2020
26981 
2016
 
2407

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters469620
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2018
2nd row2018
3rd row2019
4th row2019
5th row2020
Value    Count    Frequency (%)
2018     29598    25.2%
2019     29329    25.0%
2017     29090    24.8%
2020     26981    23.0%
2016     2407     2.1%
Histogram of lengths of the category
ValueCountFrequency (%)
201829598
25.2%
201929329
25.0%
201729090
24.8%
202026981
23.0%
20162407
 
2.1%

Most occurring characters

ValueCountFrequency (%)
2144386
30.7%
0144386
30.7%
190424
19.3%
829598
 
6.3%
929329
 
6.2%
729090
 
6.2%
62407
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number469620
100.0%

Most frequent character per category

ValueCountFrequency (%)
2144386
30.7%
0144386
30.7%
190424
19.3%
829598
 
6.3%
929329
 
6.2%
729090
 
6.2%
62407
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common469620
100.0%

Most frequent character per script

ValueCountFrequency (%)
2144386
30.7%
0144386
30.7%
190424
19.3%
829598
 
6.3%
929329
 
6.2%
729090
 
6.2%
62407
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII469620
100.0%

Most frequent character per block

ValueCountFrequency (%)
2144386
30.7%
0144386
30.7%
190424
19.3%
829598
 
6.3%
929329
 
6.2%
729090
 
6.2%
62407
 
0.5%

Month
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.637911503
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.401530338
Coefficient of variation (CV): 0.5124398444
Kurtosis: -1.178075211
Mean: 6.637911503
Median Absolute Deviation (MAD): 3
Skewness: -0.07024217255
Sum: 779324
Variance: 11.57040864
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common values
Value               Count    Frequency (%)
8                   10905    9.3%
7                   10746    9.2%
11                  10512    9.0%
5                   10287    8.8%
9                   10108    8.6%
3                   10042    8.6%
6                   10015    8.5%
10                  9948     8.5%
2                   9535     8.1%
12                  9389     8.0%
Other values (2)    15918    13.6%

Minimum 5 values
Value               Count    Frequency (%)
1                   8761     7.5%
2                   9535     8.1%
3                   10042    8.6%
4                   7157     6.1%
5                   10287    8.8%

Maximum 5 values
Value               Count    Frequency (%)
12                  9389     8.0%
11                  10512    9.0%
10                  9948     8.5%
9                   10108    8.6%
8                   10905    9.3%

Week
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.19057962
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 14
median: 28
Q3: 40
95-th percentile: 50
Maximum: 52
Range: 51
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 14.78431431
Coefficient of variation (CV): 0.5437292811
Kurtosis: -1.194923362
Mean: 27.19057962
Median Absolute Deviation (MAD): 13
Skewness: -0.06562574263
Sum: 3192310
Variance: 218.5759496
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count    Frequency (%)
48                   2822     2.4%
49                   2713     2.3%
51                   2673     2.3%
27                   2539     2.2%
8                    2507     2.1%
26                   2502     2.1%
24                   2499     2.1%
50                   2496     2.1%
33                   2496     2.1%
9                    2492     2.1%
Other values (42)    91666    78.1%

Minimum 5 values
Value                Count    Frequency (%)
1                    895      0.8%
2                    1942     1.7%
3                    2386     2.0%
4                    2363     2.0%
5                    2252     1.9%

Maximum 5 values
Value                Count    Frequency (%)
52                   1073     0.9%
51                   2673     2.3%
50                   2496     2.1%
49                   2713     2.3%
48                   2822     2.4%

Day
Real number (ℝ≥0)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.83495592
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
median: 16
Q3: 23
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.663837733
Coefficient of variation (CV): 0.5471336817
Kurtosis: -1.153643307
Mean: 15.83495592
Median Absolute Deviation (MAD): 7
Skewness: 0.007585843684
Sum: 1859103
Variance: 75.06208426
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Common values
Value | Count | Frequency (%)
13 | 4372 | 3.7%
20 | 4355 | 3.7%
19 | 4277 | 3.6%
11 | 4097 | 3.5%
12 | 4065 | 3.5%
21 | 4049 | 3.4%
27 | 4004 | 3.4%
16 | 3983 | 3.4%
24 | 3970 | 3.4%
5 | 3967 | 3.4%
Other values (21) | 76266 | 65.0%

Minimum 5 values
Value | Count | Frequency (%)
1 | 3433 | 2.9%
2 | 3446 | 2.9%
3 | 3557 | 3.0%
4 | 3636 | 3.1%
5 | 3967 | 3.4%

Maximum 5 values
Value | Count | Frequency (%)
31 | 2417 | 2.1%
30 | 3441 | 2.9%
29 | 3481 | 3.0%
28 | 3795 | 3.2%
27 | 4004 | 3.4%

Dayofweek
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.104731485
Minimum: 0
Maximum: 6
Zeros: 21122
Zeros (%): 18.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 4
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.42529502
Coefficient of variation (CV): 0.677186154
Kurtosis: -1.262712993
Mean: 2.104731485
Median Absolute Deviation (MAD): 1
Skewness: -0.05894428997
Sum: 247106
Variance: 2.031465893
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values
Value | Count | Frequency (%)
4 | 25467 | 21.7%
3 | 24368 | 20.8%
2 | 22923 | 19.5%
1 | 22835 | 19.4%
0 | 21122 | 18.0%
5 | 687 | 0.6%
6 | 3 | < 0.1%

Minimum 5 values
Value | Count | Frequency (%)
0 | 21122 | 18.0%
1 | 22835 | 19.4%
2 | 22923 | 19.5%
3 | 24368 | 20.8%
4 | 25467 | 21.7%

Maximum 5 values
Value | Count | Frequency (%)
6 | 3 | < 0.1%
5 | 687 | 0.6%
4 | 25467 | 21.7%
3 | 24368 | 20.8%
2 | 22923 | 19.5%

Dayofyear
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 360
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 186.7331204
Minimum: 3
Maximum: 365
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 3
5-th percentile: 23
Q1: 95
median: 190
Q3: 275
95-th percentile: 345
Maximum: 365
Range: 362
Interquartile range (IQR): 180

Descriptive statistics

Standard deviation: 103.5740564
Coefficient of variation (CV): 0.5546635548
Kurtosis: -1.18949385
Mean: 186.7331204
Median Absolute Deviation (MAD): 90
Skewness: -0.06003403236
Sum: 21923402
Variance: 10727.58515
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
213 | 544 | 0.5%
354 | 538 | 0.5%
31 | 536 | 0.5%
340 | 518 | 0.4%
170 | 517 | 0.4%
233 | 516 | 0.4%
184 | 516 | 0.4%
318 | 515 | 0.4%
24 | 513 | 0.4%
52 | 512 | 0.4%
Other values (350) | 112180 | 95.5%

Minimum 5 values
Value | Count | Frequency (%)
3 | 189 | 0.2%
4 | 228 | 0.2%
5 | 161 | 0.1%
6 | 165 | 0.1%
7 | 192 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
365 | 216 | 0.2%
364 | 152 | 0.1%
363 | 157 | 0.1%
362 | 186 | 0.2%
361 | 245 | 0.2%

Is_month_end
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 114.8 KiB

Value | Count | Frequency (%)
False | 113382 | 96.6%
True | 4023 | 3.4%

Is_month_start
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 114.8 KiB

Value | Count | Frequency (%)
False | 113972 | 97.1%
True | 3433 | 2.9%

Is_quarter_end
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 114.8 KiB

Value | Count | Frequency (%)
False | 116556 | 99.3%
True | 849 | 0.7%

Is_quarter_start
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 114.8 KiB

Value | Count | Frequency (%)
False | 116699 | 99.4%
True | 706 | 0.6%

Is_year_end
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 114.8 KiB

Value | Count | Frequency (%)
False | 117260 | 99.9%
True | 145 | 0.1%

Is_year_start
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 114.8 KiB

Value | Count | Frequency (%)
False | 117405 | 100.0%

Elapsed
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1118
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1544076146
Minimum: 1480550400
Maximum: 1606694400
Zeros: 0
Zeros (%): 0.0%
Memory size: 917.4 KiB

Quantile statistics

Minimum: 1480550400
5-th percentile: 1487203200
Q1: 1512345600
median: 1543536000
Q3: 1574985600
95-th percentile: 1601337600
Maximum: 1606694400
Range: 126144000
Interquartile range (IQR): 62640000

Descriptive statistics

Standard deviation: 36653780.4
Coefficient of variation (CV): 0.02373832436
Kurtosis: -1.188244809
Mean: 1544076146
Median Absolute Deviation (MAD): 31449600
Skewness: 0.01692927592
Sum: 1.8128226 × 10^14
Variance: 1.343499618 × 10^15
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1487203200 | 185 | 0.2%
1486512000 | 179 | 0.2%
1487289600 | 168 | 0.1%
1513209600 | 167 | 0.1%
1596153600 | 166 | 0.1%
1481068800 | 162 | 0.1%
1486080000 | 161 | 0.1%
1606348800 | 160 | 0.1%
1597276800 | 159 | 0.1%
1529020800 | 159 | 0.1%
Other values (1108) | 115739 | 98.6%

Minimum 5 values
Value | Count | Frequency (%)
1480550400 | 138 | 0.1%
1480636800 | 109 | 0.1%
1480896000 | 98 | 0.1%
1480982400 | 114 | 0.1%
1481068800 | 162 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
1606694400 | 146 | 0.1%
1606521600 | 13 | < 0.1%
1606435200 | 153 | 0.1%
1606348800 | 160 | 0.1%
1606262400 | 135 | 0.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
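
As a rough illustration of the calculation just described, the sketch below computes r from the population covariance and standard deviations using only the Python standard library. The function name `pearson_r` and the toy inputs are assumptions made for this example; they are not part of the report or of the profiling tool.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations (the 1/n factors cancel in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (sd_x * sd_y)


# Hypothetical toy data: y is an exact linear function of x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x for x in xs]
print(pearson_r(xs, ys))
# Shifting and rescaling one variable leaves r unchanged (up to rounding).
print(pearson_r(xs, [10.0 * y + 3.0 for y in ys]))
```

A constant column such as Is_year_start has zero standard deviation, so r is undefined for it; the sketch raises an error in that case.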

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
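
A minimal sketch of the pair-counting definition above, taken literally: concordant minus discordant pairs over the total number of pairs (often called tau-a). Libraries commonly apply a tie correction (tau-b), so values from this sketch may differ from the report's when the data contain ties. The function name `kendall_tau_a` and the inputs are assumptions for illustration; the quadratic loop is only practical for small samples.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts toward neither total
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy data: one swapped pair out of six.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

In the toy input, five of the six pairs are concordant and one is discordant, giving τ = (5 - 1) / 6.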

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
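
For illustration, here is a minimal standard-library sketch of a bias-corrected Cramér's V along the lines of Bergsma's correction. It is an assumption that the profiling tool computes the statistic in exactly this way; the function name `cramers_v_corrected` and the toy inputs are hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of nominal labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return float("nan")  # a constant column carries no association information
    return math.sqrt(phi2_corr / denom)


# Hypothetical toy data: the second label is fully determined by the first.
print(cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50))
```

The sketch returns NaN when either variable has a single category, which is the situation of the constant columns flagged in this report (Reason, Is_year_start).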

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

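Row | Applied | Gender | Payment_Method | Location | Received | Id | Reason | Age | Area | True_False | AgeGroup | Payment_Type | logapplied | logreceived | Ratio | Year | Month | Week | Day | Dayofweek | Dayofyear | Is_month_end | Is_month_start | Is_quarter_end | Is_quarter_start | Is_year_end | Is_year_start | Elapsed
0 | 134.0 | F | AV | M | 134.0 | GHI000112669 | TB | 40-44 | O | 0 | MidAge | AV | 4.897840 | 4.897840 | 1.0 | 2018 | 6 | 26 | 25 | 0 | 176 | False | False | False | False | False | False | 1529884800
1 | 800.0 | F | RP | NE | 800.0 | GHI000780038 | TB | 18-19 | Wlg | 0 | Adult | RPU | 6.684612 | 6.684612 | 1.0 | 2018 | 11 | 48 | 30 | 4 | 334 | True | False | False | False | False | False | 1543536000
2 | 270.0 | M | AV | M | 270.0 | GHI000437510 | TB | 35-39 | AM | 0 | MidAge | AV | 5.598422 | 5.598422 | 1.0 | 2019 | 3 | 10 | 5 | 1 | 64 | False | False | False | False | False | False | 1551744000
3 | 600.0 | M | AV | M | 600.0 | GHI000140582 | TB | 55-59 | C | 0 | Old | AV | 6.396930 | 6.396930 | 1.0 | 2019 | 1 | 5 | 31 | 3 | 31 | True | False | False | False | False | False | 1548892800
4 | 270.0 | F | AV | M | 270.0 | GHI001420547 | TB | 30-34 | C | 0 | MidAge | AV | 5.598422 | 5.598422 | 1.0 | 2020 | 9 | 37 | 9 | 2 | 253 | False | False | False | False | False | False | 1599609600
5 | 1360.0 | F | RP | PP | 1360.0 | GHI000096091 | TB | 25-29 | EC | 0 | Adult | RPU | 7.215240 | 7.215240 | 1.0 | 2017 | 9 | 36 | 8 | 4 | 251 | False | False | False | False | False | False | 1504828800
6 | 2020.0 | F | AV | PP | 2020.0 | GHI000157538 | TB | 35-39 | AM | 0 | MidAge | AV | 7.610853 | 7.610853 | 1.0 | 2019 | 5 | 20 | 14 | 1 | 134 | False | False | False | False | False | False | 1557792000
7 | 160.0 | M | AV | NE | 160.0 | GHI000134666 | TB | 40-44 | S | 0 | MidAge | AV | 5.075174 | 5.075174 | 1.0 | 2017 | 7 | 28 | 11 | 1 | 192 | False | False | False | False | False | False | 1499731200
8 | 960.0 | M | AV | M | 960.0 | GHI000210735 | TB | 45-49 | T | 0 | MidAge | AV | 6.866933 | 6.866933 | 1.0 | 2018 | 10 | 41 | 8 | 0 | 281 | False | False | False | False | False | False | 1538956800
9 | 400.0 | F | AV | M | 400.0 | GHI001182494 | TB | 20-24 | C | 0 | Adult | AV | 5.991465 | 5.991465 | 1.0 | 2017 | 11 | 47 | 20 | 0 | 324 | False | False | False | False | False | False | 1511136000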

Last rows

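Row | Applied | Gender | Payment_Method | Location | Received | Id | Reason | Age | Area | True_False | AgeGroup | Payment_Type | logapplied | logreceived | Ratio | Year | Month | Week | Day | Dayofweek | Dayofyear | Is_month_end | Is_month_start | Is_quarter_end | Is_quarter_start | Is_year_end | Is_year_start | Elapsed
117395 | 1800.0 | F | AV | U | 1800.0 | GHI000609805 | TB | 20-24 | AM | 0 | Adult | AV | 7.495542 | 7.495542 | 1.0 | 2020 | 3 | 12 | 20 | 4 | 80 | False | False | False | False | False | False | 1584662400
117396 | 1200.0 | F | AV | O | 1200.0 | GHI000327936 | TB | 20-24 | S | 0 | Adult | AV | 7.090077 | 7.090077 | 1.0 | 2018 | 10 | 43 | 26 | 4 | 299 | False | False | False | False | False | False | 1540512000
117397 | 1490.0 | F | AV | M | 1490.0 | GHI000886793 | TB | 30-34 | BP | 0 | MidAge | AV | 7.306531 | 7.306531 | 1.0 | 2017 | 1 | 3 | 18 | 2 | 18 | False | False | False | False | False | False | 1484697600
117398 | 880.0 | F | AV | NE | 880.0 | GHI000431616 | TB | 50-54 | T | 0 | Old | AV | 6.779922 | 6.779922 | 1.0 | 2018 | 7 | 28 | 13 | 4 | 194 | False | False | False | False | False | False | 1531440000
117399 | 1000.0 | F | AV | M | 1000.0 | GHI000102929 | TB | 65+ | O | 0 | Old | AV | 6.907755 | 6.907755 | 1.0 | 2018 | 2 | 6 | 7 | 2 | 38 | False | False | False | False | False | False | 1517961600
117400 | 1050.0 | F | AV | NE | 1050.0 | GHI000263441 | TB | 45-49 | BP | 0 | MidAge | AV | 6.956545 | 6.956545 | 1.0 | 2017 | 3 | 10 | 6 | 0 | 65 | False | False | False | False | False | False | 1488758400
117401 | 680.0 | M | RP | M | 680.0 | GHI000142341 | TB | 25-29 | C | 0 | Adult | RPU | 6.522093 | 6.522093 | 1.0 | 2017 | 8 | 35 | 28 | 0 | 240 | False | False | False | False | False | False | 1503878400
117402 | 630.0 | M | AV | NE | 630.0 | GHI000099606 | TB | 45-49 | Wlg | 0 | MidAge | AV | 6.445720 | 6.445720 | 1.0 | 2017 | 6 | 26 | 28 | 2 | 179 | False | False | False | False | False | False | 1498608000
117403 | 216.0 | M | AV | M | 216.0 | GHI000135471 | TB | 45-49 | O | 0 | MidAge | AV | 5.375278 | 5.375278 | 1.0 | 2019 | 11 | 47 | 22 | 4 | 326 | False | False | False | False | False | False | 1574380800
117404 | 340.0 | F | AV | NE | 340.0 | GHI001140992 | TB | 65+ | C | 0 | Old | AV | 5.828946 | 5.828946 | 1.0 | 2017 | 5 | 18 | 5 | 4 | 125 | False | False | False | False | False | False | 1493942400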

Duplicate rows

Most frequent

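Row | Applied | Gender | Payment_Method | Location | Received | Id | Reason | Age | Area | True_False | AgeGroup | Payment_Type | logapplied | logreceived | Ratio | Year | Month | Week | Day | Dayofweek | Dayofyear | Is_month_end | Is_month_start | Is_quarter_end | Is_quarter_start | Is_year_end | Is_year_start | Elapsed | count
120 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 20-24 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2017 | 3 | 12 | 20 | 0 | 79 | False | False | False | False | False | False | 1489968000 | 6
125 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 25-29 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2016 | 12 | 51 | 21 | 2 | 356 | False | False | False | False | False | False | 1482278400 | 5
157 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 30-34 | O | 0 | MidAge | AV | 5.337538 | 5.337538 | 1.0 | 2016 | 12 | 51 | 21 | 2 | 356 | False | False | False | False | False | False | 1482278400 | 5
167 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 30-34 | O | 0 | MidAge | AV | 5.337538 | 5.337538 | 1.0 | 2017 | 3 | 10 | 10 | 4 | 69 | False | False | False | False | False | False | 1489104000 | 5
171 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 30-34 | O | 0 | MidAge | AV | 5.337538 | 5.337538 | 1.0 | 2017 | 3 | 12 | 22 | 2 | 81 | False | False | False | False | False | False | 1490140800 | 5
327 | 210.0 | F | AV | M | 210.0 | GHI000112669 | TB | 25-29 | O | 0 | Adult | AV | 5.347108 | 5.347108 | 1.0 | 2017 | 12 | 51 | 22 | 4 | 356 | False | False | False | False | False | False | 1513900800 | 5
100 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 20-24 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2016 | 12 | 49 | 7 | 2 | 342 | False | False | False | False | False | False | 1481068800 | 4
109 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 20-24 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2017 | 1 | 4 | 24 | 1 | 24 | False | False | False | False | False | False | 1485216000 | 4
123 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 25-29 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2016 | 12 | 49 | 7 | 2 | 342 | False | False | False | False | False | False | 1481068800 | 4
135 | 208.0 | F | AV | M | 208.0 | GHI000112669 | TB | 25-29 | O | 0 | Adult | AV | 5.337538 | 5.337538 | 1.0 | 2017 | 2 | 6 | 10 | 4 | 41 | False | False | False | False | False | False | 1486684800 | 4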